This study investigates the errors due solely to sampling intervals
that occur with radar-derived total rainfall estimates. The study was
limited to nine cold-front passages over eastern Texas in the fall of
1984. Digitized, 10.3 cm wavelength radar observations were recorded
using a one-minute sampling-rate. Total rainfall estimates for 10 km
by 10 km areas based on these data were considered "ground truth"
totals.

Sample-rates, ranging from 5 to 60 minutes, were applied to the
recorded data to calculate total rain estimates for each sample rate.
These derived rain totals were compared to the "ground truth" totals,
with the differences referred to as "errors." These errors were plotted
against the sampling-rate. They ranged from over 100% for sample
intervals greater than 50 minutes, to less than 25% for intervals less
than 15 minutes. The errors were also plotted against the number of
samples taken. There was no significant increase in estimate accuracy
when more than seven samples were taken per 80 minute period.

Other variables, the mean rain rate, total rain, sequential
variability, storm width, and storm speed of movement, were found to
have very low correlations with the errors. Analyses of variance done on
subdivisions of the storm width, storm speed, and mean rain rate
variables proved inconclusive because of small, unbalanced sample sizes.
Regression analyses were used to develop the "best" models, using error
as the dependent variable. The resulting equations relate the errors to
the sampling-rate and the number of samples taken. These models were
then used as predictors of the expected errors in total rain estimates.
The predictions are applicable to individual, 10 km by 10 km area,
total rain measurements.
CHAPTER  I


INTRODUCTION 


Overview 

Radar-derived values of total rainfall are, in addition to many
other factors, a function of the time interval between radar samples.
Most users of such information have little quantitative knowledge
regarding the errors in total rainfall estimates that can occur due to
variations in the rainfall rate during the intervals between samples.
Sampling intervals usually range from 5 to 30 minutes.

In addition to the general desirability of determining these errors,
there is a specific need in the military for such information. A
tactical military weather radar must operate no longer than is
absolutely necessary because its electromagnetic radiation presents it
as a target. Such a radar used for military hydrological purposes needs
to accumulate precipitation data sufficient to derive the total rainfall
over the area of interest.

A derivation of such errors due to the variations in the sampling
interval is of major importance to the military as well as the
scientific community.

Objectives 

This study was undertaken to identify and quantify the differences
between total rain estimates as functions of the radar sample-rates.

This study follows the style and format of the Journal of Climate
and Applied Meteorology.

The total rain estimate derived using radar data recorded at a
one-minute sample-rate was assumed to be the "ground truth" estimate.
Total rain estimates using other sampling-rates were compared to this
"ground truth." The resulting differences are referred to loosely as
"errors" in this study. The specific objectives are as follows.

(1) Complement and extend the applicability of previous, similar
studies that were based only on rain gage network data.

(2) Use descriptive statistical techniques to describe and place
bounds on the expected error associated with specific radar sampling
intervals.

(3)  Develop  regression  relationships  that  can  be  used  to  predict 
the  expected  error  when  given  a  certain  sampling  interval  and  field 
determinable  parameters,  such  as  storm  depth  and  speed  of  movement. 

Previous  Research 

Wilson (1964) showed that the sampling rate used to observe
precipitation events can contribute significant errors to the overall
rainfall estimates. The more important questions are which factors have
a bearing on these errors and how much of the error can be attributed to
each such factor.

The results of this study can be applied in several meteorological,
hydrological, and agricultural specialties. Models for weather
modification verification and streamflow or flood control forecasting
use integrated precipitation over areas as important inputs (Larson,
1974; McGuiness, 1963). Outputs from these models can be no more
accurate than the inputs. Therefore, any research that can describe and
quantify the expected errors in radar rainfall measurements would be of
benefit (Brandes, 1975; Jatila and Puhaka, 1973a,b).

In addition, a measurement of these errors is important to the
survivability of tactical weather radars now in use. When such a radar
is operating, or active in its hazardous battle environment, the radar
beam can act as a homing beacon for enemy rockets or missiles. Thus,
the less frequently the radar is active, the better its chances of
survival. If the weather officer has knowledge of the relative
accuracies of different sampling rates, he can then use the minimum rate
necessary. To achieve this, a delineation of the expected statistical
bounds of error for certain sampling intervals is needed.

Previous studies by Huff and Neill (1957) and Linsley and Kohler
(1951) looked at sampling-rate caused errors with extensive raingage
networks. Neill (1953) worked with raingage data from 8 storms. He
related the standard error of the estimate to the total storm rainfall
and the sampling interval used. His equation is

$$E_s = 4 \times 10^{-3}\, R_t^{1.13}\, T^{1.29} \qquad (1)$$

where $E_s$ is the standard error of the sampling interval estimate,
$R_t$ is the total integrated rainfall in inches, and $T$ is the
sampling interval in hours.

Mueller (1957) worked with one-minute data from Neill's 8 storms
plus 12 more, of varied synoptic types. He investigated a measure of
the rate of change of rain intensity with time, which he called
sequential variability, written as


(2)

where D is the sequential variability in mm/h, R is the mean rainfall
rate in mm/h for the minute indicated by the subscript n, and N is the
total number of minutes sampled. This quantity will be used in this
study.

However,  in  considering  the  best  multiple  correlation  coefficient, 
he  concluded  that  a  simple  relationship  between  the  standard  error  of 
the  total  rainfall  estimate,  the  total  mean  storm  rainfall,  and  the 
sampling  interval  was  the  best  estimate  of  sampling  error.  Thus 
Mueller's  equation  is 

$$E_s = 1.05 \times 10^{-3}\, R_t^{0.57}\, T^{1.54} . \qquad (3)$$


This  equation  showed  the  standard  error  to  be  less  dependent  on 
total  storm  mean  rainfall  than  Neill's.  Mueller  attributed  this  to  the 
difference  in  storm  types. 
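
As a rough illustration of how Eqs. (1) and (3) differ, the two relations can be evaluated side by side. This is only a sketch: the coefficients and exponents are taken as they read in the text above (4 x 10^-3, 1.13, 1.29 for Neill; 1.05 x 10^-3, 0.57, 1.54 for Mueller), the function names are hypothetical, and it is assumed here that Mueller's equation uses the same units as Neill's (inches and hours), which the text does not state.

```python
def neill_standard_error(total_rain_in, interval_h):
    """Eq. (1) as read from the text: E_s = 4e-3 * R_t**1.13 * T**1.29."""
    return 4e-3 * total_rain_in ** 1.13 * interval_h ** 1.29


def mueller_standard_error(total_rain_in, interval_h):
    """Eq. (3) as read from the text: E_s = 1.05e-3 * R_t**0.57 * T**1.54.

    Units are assumed (not confirmed by the source) to match Eq. (1).
    """
    return 1.05e-3 * total_rain_in ** 0.57 * interval_h ** 1.54


if __name__ == "__main__":
    # Doubling the total rainfall raises Neill's error by 2**1.13 but
    # Mueller's by only 2**0.57, i.e. a weaker dependence on rainfall.
    for rain in (1.0, 2.0):
        print(rain, neill_standard_error(rain, 0.5),
              mueller_standard_error(rain, 0.5))
```

The smaller rainfall exponent in the second function is what the text means by the standard error being less dependent on total storm mean rainfall.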

Huff (1970) was not concerned with sampling rates but related the
mean rainfall rate and the gage density to the sampling error in the
equation

$$E = -1.522\, R_m^{0.87}\, G^{0.52} \qquad (4)$$

where $E$ is the sampling error in inches, $R_m$ is the areal mean
rainfall rate in inches per hour, and $G$ is the gage density in square
miles per gage. This was done for 29 storm samples over a gage network
of 100 mi square. He determined that the mean rainfall rate, or
intensity, was an important variable when assessing errors.

Wilson (1970) also used rain gages to infer expected errors in
radar rainfall estimates as functions of sampling interval and size of
the integration area. Besides showing the expected increase in error
with increasing sampling interval length, his results made it apparent
that the size of the total integrated area had a large effect.


CHAPTER  II 


PROCEDURE 


Data  Collection 

The  WSR/TAM-1,  10.3  cm  wavelength  radar  was  used  to  collect  the 
rainfall  data.  Digitized  radar  data  were  recorded  for  later  playback 
and  analyses. 

The physical range of this study, seen in Fig. 1, consisted of a
300 km by 300 km area divided into four quadrants, with the radar in the
center. This range of 150 km radius about the radar limited volume
filling or beam height errors. This large area was subdivided into a
10 km by 10 km grid, which is the military's basic hydrologic unit.

With the aid of a data processing program a determination was made
of the average radar reflectivity factor for each 10 km by 10 km grid
area. This average reflectivity value was then converted to an
instantaneous average rainfall rate for the grid area with the often
used relation

$$R = \left( \frac{Z}{200} \right)^{0.625} \qquad (5)$$

where $Z$ is the average grid area reflectivity factor in mm^6/m^3 and
$R$ is the rainfall rate in mm/h.
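
The conversion from reflectivity to rain rate can be sketched in a few lines. This is a minimal illustration, not the processing program used in the study: the function names are hypothetical, and the coefficient a = 200 with exponent b = 1.6 (the Marshall-Palmer form, whose inverse exponent is 1/1.6 = 0.625) is an assumption consistent with the .625 exponent in the text.

```python
def rain_rate_from_reflectivity(z_mm6_m3, a=200.0, b=1.6):
    """Invert Z = a * R**b to get R in mm/h from Z in mm^6/m^3.

    a = 200 and b = 1.6 are assumed values (Marshall-Palmer form);
    1/b = 0.625 matches the exponent quoted in the text.
    """
    if z_mm6_m3 < 0:
        raise ValueError("reflectivity factor must be non-negative")
    return (z_mm6_m3 / a) ** (1.0 / b)


def dbz_to_z(dbz):
    """Convert logarithmic reflectivity (dBZ) to linear Z in mm^6/m^3."""
    return 10.0 ** (dbz / 10.0)


if __name__ == "__main__":
    for dbz in (20, 30, 40, 50):
        print(dbz, round(rain_rate_from_reflectivity(dbz_to_z(dbz)), 2))
```

Because the relation is nonlinear, averaging Z over a grid area before converting (as described above) will not in general give the same value as converting each bin and then averaging R.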

The study was limited to cold front type precipitation events for
two reasons:

(1) This type of system occurred most frequently in this area
during the data collection period from September to November of 1984.

(2) A somewhat homogeneous type of line shape of radar echo was
needed to measure directly several of the physical characteristics of
the storms.


Fig. 1. Illustration of the data grid of 10 km by 10 km areas.
The actual recorded data were from nine different storms.


The storms were observed by the radar and recorded at an antenna
rotation rate of 1 rpm. This provided information on every grid area
every 60 seconds. This rate was then somewhat comparable, time-wise
only, to the previously mentioned studies because they used one-minute
recording rain gages. The recorded one-minute radar data were then
considered "ground truth" or the best possible estimate of rainfall.
Total rain estimates based on these one-minute data were also considered
the "ground truth" for comparisons with other sample-rate estimates.

The 1 rpm recording rate forced an extrapolation of the rainfall
rate data at several of the sample intervals. This extrapolation became
necessary because the digitized radar tapes recorded for 85-90 minutes
at this antenna rotation rate. With this time span in mind it was
decided to process all tapes for a uniform 80 minute time span. The
problem then was how to choose the sample intervals to use on the data.
Only the intervals of 5, 10, 20, and 40 minutes fit evenly into the 80
minute tape time. While the errors beyond the 40 minute interval were
of interest, it was also desirable to have more data points at the
shorter sample intervals. If the 80 minute observation were included as
the last data point for all extrapolated intervals, then the stated
sample intervals would not accurately reflect the actual time intervals
with which the data were observed. In that case the stated 50 minute
sample interval would actually consist of an end observation interval of
30 minutes. Therefore it was decided to extrapolate the last rain rate
measured by a full sample interval to the 80 minute end time. For
example, the 50 minute sample was based on a first observation at
starting time, a second at the 50 minute point, and then this 50 minute
rate was used as the rain rate at the 80 minute end point. This was
thought to be the best way to handle the dilemma because, without
actually taking a radar observation at the tape end point, it would not
be known if the rain rate had increased or decreased, both being equally
possible. Over a large sample the mean of the data errors due to these
extrapolations should be very small because of the equal possibility of
under-estimating or over-estimating any given rain rate. The sample
intervals of 5, 10, 15, 20, 25, 30, 40, 50, and 60 minutes were used in
this study. They are shown with their amounts of extrapolated data in
Fig. 2.


Fig. 2. Illustration of sample interval points of measurement and
extrapolation within an 80 min time span.


Each selected sample interval corresponded to a set number of
samples because of the fixed 80 minute data tape time in this study.
These numbers of samples provide a different measure of how the storm
was sampled. The 50 and 60 minute sample intervals are different
because their measurements are taken at different time points. When
these two intervals are compared by the number of samples, however,
they are the same; both allow two actual samples in the fixed time span.
The sampling error of a total rain estimate ultimately depends on how
well the sampling technique can define the temporal rain profile. In
this way the number of samples is important because it has a direct
effect on how well that profile is defined. For this reason relating
numbers of samples to errors of rain estimates could develop useful
predictive type relations. Using the selected sample intervals of
5, 10, 15, 20, 25, 30, 40, 50, and 60 minutes allows 17, 9, 6, 5, 4, 3,
3, 2, and 2 samples respectively for each total rain estimate made in
the 80 minute time span.
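
The sampling schedule and the end-point extrapolation described above can be sketched as follows. This is a minimal illustration with hypothetical function names. The hold-last-rate extrapolation follows the text; the trapezoidal integration between samples is an assumption, since the integration formula used in the study is not legible in this copy.

```python
def sample_times(interval_min, span_min=80):
    """Minutes at which actual radar samples are taken, starting at 0."""
    return list(range(0, span_min + 1, interval_min))


def total_rain_mm(rates_mm_h, interval_min, span_min=80):
    """Total rain (mm) from a one-minute rate series sampled every interval_min.

    rates_mm_h[t] is the grid rain rate at minute t (length span_min + 1).
    If the last actual sample falls before span_min, its rate is carried
    forward to the end point, as described in the text. Trapezoidal
    integration is an assumption of this sketch.
    """
    times = sample_times(interval_min, span_min)
    rates = [rates_mm_h[t] for t in times]
    if times[-1] < span_min:
        times.append(span_min)    # extrapolated end point
        rates.append(rates[-1])   # hold the last measured rate
    total = 0.0
    for i in range(len(times) - 1):
        dt_h = (times[i + 1] - times[i]) / 60.0
        total += 0.5 * (rates[i] + rates[i + 1]) * dt_h
    return total


if __name__ == "__main__":
    intervals = [5, 10, 15, 20, 25, 30, 40, 50, 60]
    print([len(sample_times(t)) for t in intervals])
```

The printed list gives the number of actual samples per interval, which can be compared with the counts quoted in the paragraph above.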

Variable  Selection 

There  are  several  variables  that  could  possibly  help  to  explain  the 
observed  errors  when  increasing  the  sampling  interval.  The  grid  instan¬ 
taneous  rainfall  rate  R  in  mm/h  is  the  basic  unit  used  to  compute  these 
variables.  The  first  three  variables  were  calculated  by  a  program  that 
processed  the  radar  data  and  gave  the  variables  of  interest  for  each 
grid  area  within  one  quadrant.  These  variables  are  as  follows: 

(1) The total integrated rainfall, $R_t$ (mm), is the total rain
for the entire sampled time.

(2) The mean rainfall rate, R (mm/h), is


(7)


or the mean grid rain rate over the entire sampled time.

(3) The sequential variability D in mm/h is the rate of change of
rain intensity with time. It is shown in Eq. (2).

The following three quantities were determined from tracings of the
echoes' first and last positions:

(1) The horizontal depth of the precipitation area, in km, is
measured along the direction of movement. This depth is the arithmetic
mean of the distances from the area's leading edge to its trailing edge,
measured at the first and last sample times.

(2) The horizontal depth of the sampled portion of the precipitation
area, in km, is the portion of the precipitation area that has
passed over the sampling point during the sampling time. This depth is
measured as the distance the area's leading edge has moved during the
sampled time.

(3) The precipitation area speed, in km/h, is the average rate of
movement during the sampling time. The speed is calculated as the
arithmetic mean of the distances that the leading and trailing edges
moved during the sampled time, divided by the sampled time.
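
The three tracing-derived quantities reduce to simple arithmetic on edge positions. The sketch below assumes, purely for illustration, that each edge position is expressed as a distance in km along the direction of storm movement; the function and argument names are hypothetical and are not taken from the study.

```python
def storm_depth_km(lead_first, trail_first, lead_last, trail_last):
    """Mean leading-to-trailing edge distance at the first and last times."""
    return 0.5 * (abs(lead_first - trail_first) + abs(lead_last - trail_last))


def sampled_depth_km(lead_first, lead_last):
    """Distance the leading edge moved during the sampled time."""
    return abs(lead_last - lead_first)


def storm_speed_km_h(lead_first, trail_first, lead_last, trail_last,
                     sampled_min=80):
    """Mean displacement of the two edges divided by the sampled time."""
    mean_displacement = 0.5 * (abs(lead_last - lead_first)
                               + abs(trail_last - trail_first))
    return mean_displacement / (sampled_min / 60.0)


if __name__ == "__main__":
    # Hypothetical edge positions (km along the direction of movement).
    edges = dict(lead_first=50.0, trail_first=10.0,
                 lead_last=90.0, trail_last=40.0)
    print(storm_depth_km(**edges))
    print(sampled_depth_km(edges["lead_first"], edges["lead_last"]))
    print(storm_speed_km_h(**edges))
```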

Raw  Data  Analysis  and  Products 


In analyses of the digitized radar data tapes the previously
mentioned calculations were done for the variables of interest. The
calculated values were then plotted for each grid area over the 150 km
by 150 km quadrant of interest.


A  portion  of  such  a  plot  for  a  sample  interval  of  one  minute  is 
shown  in  Fig.  3.  Such  derived  results  were  plotted  for  each  grid  area. 
The  upper  number  is  the  grid  rain  rate  in  mm/h  at  the  last  sample  time 
while  the  second  number  is  the  mean  grid  rain  rate  in  mm/h  for  the  total 
sampled  time.  The  third  number  in  each  grid  area  is  the  grid's  total 
integrated  rainfall  in  mm  for  the  entire  sampled  time  and  the  lower 
number  in  each  box  is  the  sequential  variability  in  mm/h.  These  values, 
derived  using  the  one-minute  sample  data,  were  then  considered  "ground 
truth"  estimates  for  each  grid  area. 

A somewhat similar plot was used for each of the sample intervals
ranging from 5 to 60 minutes. The upper number is the rain rate in mm/h
at the last sample time while the sample interval's total integrated
rain in mm is the lower number. This plot is shown in Fig. 4. Such a
plot was generated for each sample interval.

Fig. 3. Illustration of plotted variables within grid areas for
the 1 min sample interval.

The total integrated rainfall estimate for each sample interval was
then compared to the "ground truth" one-min interval total rainfall on a
grid by grid basis. From this comparison the error was calculated and
expressed as the absolute percent error of the total rain estimate for a
given sample interval. This quantity was calculated as

$$APE = \frac{100\, \lvert R_t(n) - R_t(1) \rvert}{R_t(1)} , \qquad (8)$$

where APE is the absolute percent error and $R_t$ is the total
integrated rain calculated for the sample intervals indicated by n and
1, which refer to the number of minutes in the sample interval used.
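
The absolute percent error is straightforward to compute once the two totals are available. The following is a minimal sketch with a hypothetical function name; it assumes the one-minute "ground truth" total is greater than zero, since the percentage is undefined otherwise.

```python
def absolute_percent_error(total_n_mm, total_1_mm):
    """Absolute percent error of an n-minute total against the 1-minute total.

    total_n_mm: total integrated rain (mm) from the n-minute sample interval.
    total_1_mm: "ground truth" total (mm) from the one-minute data.
    """
    if total_1_mm <= 0:
        raise ValueError("ground truth total must be positive")
    return 100.0 * abs(total_n_mm - total_1_mm) / total_1_mm


if __name__ == "__main__":
    # An over-estimate and an under-estimate of the same size give the
    # same absolute percent error.
    print(absolute_percent_error(12.0, 10.0))
    print(absolute_percent_error(8.0, 10.0))
```

Because the sign is discarded, an APE value alone does not indicate whether the coarser sampling over-estimated or under-estimated the total.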

The recorded radar data tapes were played back to make tracings of
the storm echoes. Fig. 5 shows a typical tracing of the echo positions
at the first data processing time (dashed line) and the last processing
time (solid line). The tracings were constrained to enclose areas of
precipitation rates greater than 2.4 mm/h. From these tracings, grid
areas were selected along the storm's leading edge. Two different
sample sets of data were selected from each storm. One sample set,
ADATA, was selected with the condition that the grid areas were within
the precipitation area for the entire sampled time. This set comprised
values from 103 grid areas. The other set, BDATA, had grid areas under
the traced echo at the beginning sample time but not always under the
echo at the last sampled time. This data set had 89 grid areas.

Fig. 4. Illustration of plotted variables within grid areas for
the 5 to 60 min sample intervals.

Fig. 5. Illustration of overlaid precipitation echoes and typical
grid areas selected as data points. Legend: echo position at first data
processing time; echo position at last data processing time; a, ADATA
data set point; b, BDATA data set point.

Since  the  mean  rain  rates  were  calculated  by  averaging  over  the 
total  sampled  time,  rain  rates  in  BDATA  are  not  valid  measurements.  The 
only  other  difference  between  the  sets  was  that  BDATA  grid  areas  were 
generally  10  to  20  km  further  into  the  storm,  away  from  the  leading 
edge.  The  two  data  sets  were  assumed  to  be  independent  samples.  Two 
large  data  sets  were  useful  for  comparisons  of  mean  errors. 

Huff (1970) stated that rainfall measurement variables are not
normally distributed. This makes statistical analyses difficult because
analyses of variance and multiple comparison tests require assumptions
of normality and equal variances (Ostle and Mensing, 1982). Huff (1970)
found that using log transformations of the rainfall variables was the
best method of approaching normalization of the data. Natural log
transformations of the rainfall errors and rainfall variables were used
for all statistical analyses of the data in this study. All statistical
testing in this study was done at the 95% confidence level (a
significance level of 0.05).

The total sample in numbers of grid areas was very large by
statistical standards. The data were from 192 grid areas. The errors
were determined for nine different sample intervals for each area, or a
total of 1728 calculated errors. For a sample this large the results
can be generalized fairly accurately with descriptive statistics (Ostle
and Mensing, 1982).


CHAPTER  III 


STATISTICAL  ANALYSES 


Descriptive  Statistics 

The absolute percent error described in Eq. (8) was decidedly the
most practical measure of the difference between a certain sample
interval's total rain estimate and the 1 minute sampled "ground truth"
total rain. Expressing the error as a percentage scaled the errors so
that those of a light rainfall were comparable to those of a heavy
rain. Taking the absolute value of the percentage error was necessary
because there is no possible way of knowing if a radar is
under-estimating or over-estimating the rainfall at any specified point
and time. Thus, in interpreting the results presented here it must be
kept in mind that the true errors could be positive or negative.

The arithmetic mean, over all grid areas, of the absolute percent
errors for each sampling interval and their associated standard
deviations were calculated for each data set. A summary of each
individual data set's statistical measures of the errors is shown in
Table 1.

This  table  shows  that  the  BDATA  set  of  observations  had  larger  mean 
errors  than  the  ADATA  set,  at  all  sample  intervals  except  10  min.  The 
BDATA  set  also  had  larger  standard  deviations  at  all  sample  intervals 
except  10  and  20  min. 

Are the two data sets' mean errors of total rain estimates
statistically different? The sets came from different grid areas within
nine storms. If it can be shown that they are not significantly
different, then one plot of combined mean data would be more
representative of the recorded data because the sample size would be
effectively doubled. An analysis of variance was used to compare the
mean errors of the data sets. This statistical test assumes equal
variances and a normal distribution; thus it required the log
transformation of the errors. The log transformed errors were then
tested in two different ways.


Table 1. Summary of statistical values of absolute percent error
by sample interval (min) for the individual data sets.

Sample      Mean      Maximum
interval    error     error
(min)       (%)       (%)

Data set = ADATA

 5           1.23       5.76
10           4.56      26.16
15           7.52      38.56
20          11.77     131.44
25          14.53      71.29
30          23.43     138.60
40          29.49     104.97
50          41.09     128.85
60          42.55     106.36

Data set = BDATA



First a t-test was done by sample intervals to compare the mean
errors of the sets. The ADATA set had 103 observations averaged for
each sample interval, while for the BDATA set each mean was of 89
observations. The results of the tests are shown in Table 2. The
overall results of this test show the highest t-statistic to have a
level of significance of p = .1676. From this it was concluded that
there were no significant differences between the mean errors of the two
data sets when they were compared by sampling intervals.

The second t-test was a comparison of each data set's mean error
averaged over all sample intervals, shown as the lower line in Table 2.
The results were a t = 0.8369 with a level of significance of p =
0.4028. The conclusion was that there was no significant difference in
the overall mean errors of the data sets.
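
The comparison of the two data sets can be sketched as a two-sample t-test on log-transformed errors. This is a minimal illustration, not the SAS procedure used in the study: the function names are hypothetical, the pooled (equal variance) form of the test is an assumption, and dropping zero errors before taking logarithms is also an assumption, since the text does not say how zero errors were handled.

```python
import math
from statistics import mean, variance


def log_transform(errors_pct):
    """Natural log of positive errors; zeros are skipped (an assumption)."""
    return [math.log(e) for e in errors_pct if e > 0]


def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance, and degrees of freedom."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean(a) - mean(b)) / std_err, n1 + n2 - 2


if __name__ == "__main__":
    # Hypothetical absolute percent errors for two small samples.
    adata = [1.2, 0.8, 2.5, 1.9, 0.4]
    bdata = [1.6, 1.1, 3.0, 0.9, 2.2]
    t, df = pooled_t_statistic(log_transform(adata), log_transform(bdata))
    print(t, df)
```

The resulting statistic would then be compared with the critical value of the t distribution for the given degrees of freedom at the chosen significance level.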


Table 2. ADATA and BDATA t-test comparisons of mean absolute percent
errors by sample interval and by overall data set means.

Sample                                Conclude
interval    t-statistic    p-value    means are:
(min)

 5            -1.3852       .1676      EQUAL
10              .2598       .7953      EQUAL
15             -.3597       .7195      EQUAL
20             -.7735       .4402      EQUAL
25             -.5207       .6032      EQUAL
30             -.4955       .6208      EQUAL
40            -1.3630       .1745      EQUAL
50              .9749       .3310      EQUAL
60             -.4762       .6345      EQUAL

Overall         .8369       .4028      EQUAL

From the above t-tests it is seen that the mean errors of total
rainfall estimates of the two data sets were not statistically
different. This then allows the sets to be combined and a mean error of
the rain estimates calculated for each sample interval. These overall
mean errors were plotted in Fig. 6. The actual statistical values for
the combined data set are listed in Table 3. For such a large sample
size the errors are assumed to be normally distributed. Thus, it can be
said that 95% of the absolute errors measured would be expected to lie
within limits described as

$$95\%\ \text{of the absolute errors} \le \overline{APE} + 1.96\, s , \qquad (9)$$

where $\overline{APE}$ is the mean absolute percent error and $s$ is the
associated standard deviation. This limit, when plotted for each sample
interval, results in the upper curve in Fig. 6. The curves were fit to
the data with a SAS cubic regression drawing routine. Since the
combination of data sets almost doubled the sample size this plot is
probably more representative than either of the separate sets' curves
taken individually.


Fig. 6. Plot of mean absolute errors by sample interval. Data
were the combined ADATA and BDATA sets, 192 observations per plotted
point.


Table 3. Summary of the statistical values of absolute percent
error by sample interval (min) for the combined data set.

Sample      Minimum    Mean      Maximum    Standard
interval    error      error     error      deviation
(min)       (%)        (%)       (%)

Data set = combined data

 5          0.00        1.45      32.31       2.49
10          0.00        4.45      26.22       5.04
15          0.00        8.02      51.33       8.37
20          0.07       12.28     131.44      13.14
25          0.00       16.21      82.60      15.62
30          0.10       24.93     177.43      25.32
40          0.66       31.61     149.18      27.60
50          0.24       41.99     181.39      33.95
60          0.30       45.44     254.88      34.33


In Fig. 6 the mean errors, the lower curve, increase as expected
with increasing sampling interval. The lower slope of the curve at
larger sample intervals may be due to the extrapolation technique used.
It must be remembered that the 50 minute interval had 30 minutes of
extrapolated data, whereas the 60 minute interval had only 20 minutes.
Thus, somewhat less confidence can be placed in the curves beyond the
40 minute point on all plots that use the sample interval as the
abscissa.

With the above stated cautions applied, Fig. 6 and Table 3 can be
used to approximate mean errors from other large samples. The mean
errors summarized here are very similar to those found by Wilson (1970)
in a study of convective storm radar-derived rain total estimates. He
found 45%, 25%, and 13% mean error at the 60, 30, and 15 minute sample
intervals respectively. These compare favorably with the 45%, 25%, and
8% mean errors found for the combined data in Table 3. The upper curve
in Fig. 6 defines the upper limit of the area within which the errors
would fall for 95 out of 100 estimates with a specific sample interval.
Ninety-five percent of the errors of total rain estimates should be less
than 50% if a sample interval of 25 minutes is used. To be within 25%
error an interval of 15 minutes is necessary. To keep the absolute
percent error of a total rain estimate less than 100% a maximum sample
interval of approximately 46 minutes would be necessary. As a rough
check on the statistical accuracy of the upper curve in Fig. 6, it was
found that 96.01% of the measured errors fell within its limits. This
fact reinforces the validity of the curve and our basic assumptions.
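
The interval thresholds quoted above can be approximated directly from the combined-data means and standard deviations in Table 3. The sketch below applies the mean plus 1.96 standard deviations limit of Eq. (9) at each tabulated interval and then interpolates linearly between intervals. The linear interpolation is an assumption made for simplicity; the curves in Fig. 6 were fit with a cubic regression routine, so the values will differ slightly. Function and variable names are hypothetical.

```python
# Sample interval (min) -> (mean APE %, standard deviation), from Table 3.
TABLE3 = {
    5: (1.45, 2.49), 10: (4.45, 5.04), 15: (8.02, 8.37),
    20: (12.28, 13.14), 25: (16.21, 15.62), 30: (24.93, 25.32),
    40: (31.61, 27.60), 50: (41.99, 33.95), 60: (45.44, 34.33),
}


def upper_limit(mean_ape, std_dev):
    """Eq. (9): limit below which about 95% of the errors are expected."""
    return mean_ape + 1.96 * std_dev


def interval_for_limit(limit_pct):
    """Sample interval (min) at which the 95% limit reaches limit_pct.

    Uses linear interpolation between tabulated intervals (an assumption).
    """
    points = sorted((t, upper_limit(m, s)) for t, (m, s) in TABLE3.items())
    for (t0, u0), (t1, u1) in zip(points, points[1:]):
        if u0 <= limit_pct <= u1:
            return t0 + (t1 - t0) * (limit_pct - u0) / (u1 - u0)
    raise ValueError("limit lies outside the tabulated range")


if __name__ == "__main__":
    for limit in (25.0, 50.0, 100.0):
        print(limit, round(interval_for_limit(limit), 1))
```

Under these assumptions the interpolated intervals fall close to the 15, 25, and 46 minute values read from the upper curve in the text.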

The  combined  data  set's  errors  were  also  related  to  the  number  of 
samples  taken  in  the  80  minute  time  span  of  observations.  At  each  of 

The plots show the striking lack of difference in errors for
numbers of samples greater than eight. The mean error from this point
and greater is less than approximately 5%. Evidently this is a point at
which the rainfall temporal profile becomes fairly well defined.
Increasing the number of samples beyond this number has little effect in
decreasing the errors because the rain profile is almost as well defined
as can be. The mean errors increase dramatically when fewer than six
samples are taken. If only two samples are taken in an 80 minute span
the error can be expected to be less than approximately 110% in 95 out
of 100 cases. If an error of less than 50% or 25% is required the upper
curve indicates that at least 4 or 6 samples, respectively, would be
necessary during an 80 minute observation period.

Correlations  and  Analyses  of  Variance 

Previous investigators, cited earlier, established several variables
important to the explanation of errors of total rain estimates. This
present study has measured or calculated values for the independent
variables of total rain, mean rain rate, sequential variability,
storm depth, storm speed, and storm sampled depth for each grid
area.  The  other  variables  are,  of  course,  the  sample  interval  with 
which  the  storm  is  observed  and  the  number  of  samples  taken  of  the 
storm. 

This  section  deals  with  finding  the  relative  importance  of  each  of 
the  variables  in  relation  to  the  errors  of  rainfall  estimates.  The  next 
step  is  categorizing  or  subdividing  the  variables  into  groups  to  further 
delineate  the  errors. 


Which  variables  have  a  significant  effect  or  are  highly  correlated 
with  the  measured  errors?  A  correlation  analysis  (SAS  Institute  Inc., 
1982b)  was  accomplished  to  determine  the  relations  of  the  independent 
variables  to  the  mean  errors.  These  analyses  were  done  only  on  the 
ADATA  set  because  of  inclusion  of  the  mean  rain  rate  variable.  The  log 
of  the  sample  interval  and  the  log  of  the  number  of  samples  clearly  had 
the  highest  correlation  coefficients,  .73  and  -.73  respectively,  with 
the  log  of  the  absolute  percent  errors.  The  next  highest  correlation 
coefficient  with  the  error  was  .21,  with  the  sequential  variability. 
One-minute rain totals and mean rain rates have similar correlation
coefficients of .12. The last three variables, the precipitation area
characteristics,  are  all  slightly  negatively  correlated  to  the  error, 
with  correlation  coefficients  from  -.02  to  -.05. 
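The correlation step can be illustrated with a short sketch. It computes the Pearson correlation coefficient between the log of the sample interval and the log of the absolute percent error, which is the kind of quantity reported above. The data values and names are hypothetical and only show the mechanics; the study itself used a SAS procedure.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only (not data from this study).
sample_interval_min = [5, 10, 15, 20, 30, 40, 60]
abs_percent_error = [1.0, 5.0, 6.0, 15.0, 20.0, 45.0, 40.0]

log_t = [math.log(t) for t in sample_interval_min]
log_ape = [math.log(e) for e in abs_percent_error]

r_log = pearson_r(log_t, log_ape)
print(f"r(ln T, ln APE) = {r_log:.2f}")
```

Squaring a simple correlation coefficient gives the share of variance explained by that one variable, so a coefficient of .73 corresponds to roughly 53% of the variance of the log error.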

It  is  obvious  that  the  sample  interval  selected  and  the  number  of 
samples  taken  would  have  the  most  effect  on  the  rain  estimate's  errors. 
The  other  variables  can  not  be  dismissed  outright  as  unimportant  because 
of the wide ranges of the measured variables. The sequential
variability varied by a factor of nearly 50 in its untransformed state,
while the others varied from a factor of 5 to over 25. It may be that
each variable is more or possibly less correlated with the mean errors
depending on where in the ranges the measurements were made. For example,
were  the  large  mean  errors  associated  with  large  rain  rates,  or  faster 
moving  storm  areas?  Even  though  this  study  only  included  storms  of  a 
cold-front  type,  Huff  (1970)  found  great  variability  within  storms  of 
the  same  synoptic  type. 

In view of this variability it was decided to categorize the
independent variables in an attempt to further investigate the factors
to which the sampling errors could be attributed. Disregarding the
variables of sampling technique, the mean rain rate of a storm area is
the only one of the next three largest correlations that is roughly
estimable with one or two radar scans, so it was a candidate for
subdivision into categories. The area sampled depth and the area speed
were  found  to  be  highly  correlated  with  a  correlation  coefficient  of 
.992.  This  was  known  beforehand  because  the  speed  was  used  to  calculate 
the  sampled  depth.  The  speed  was  the  selected  variable  here  because  it 
could  be  determined  more  directly  with  at  least  two  radar  scans  of  an 
area.  The  storm  depth  was  also  categorized  because  it  was  not  highly 
correlated  with  any  of  the  other  variables. 

The  subdivision  of  the  selected  variables  was  done  subjectively  by 
investigating  the  ranges  of  each.  The  variables  were  categorized  into 
three  subdivisions  each.  The  results  of  these  analyses  were  plotted 
using a SAS cubic regression routine to draw the lines-of-best-fit.

The mean rainfall rate was divided into rates of less than 2.4 mm/h
(light), within 2.4 to 12 mm/h (medium), and greater than 12 mm/h
(intense).  There  were  180  observations  in  the  light  class,  630  in  the 
medium  class,  and  114  in  the  intense  rain  rate  class.  The  mean  errors 
for  each  subdivision  were  calculated  for  each  sample  interval.  These 
calculations  had  20  observations  per  mean  error  in  the  light  class,  70 
in  the  medium  class,  and  13  in  the  intense  class.  The  results  were 
plotted and are shown in Fig. 8. The outstanding difference in the
figure is that the light rain rate class appears to have considerably
less error than the other classes at large sample intervals. Statistical
testing was done to see if the classes differed significantly.

An analysis of variance was done to compare the mean log transformed
errors between the subdivisions of rain rates. This test calculated an
F value of 7.85 and a level of significance of p = .0004. This
pointed to a significant difference between the mean errors of at least
two classes of rain rates. Fisher's least significant difference t-test
was applied to test which means differed. The results showed the
overall mean errors of the light rate class differed significantly and were
less  than  either  of  the  other  classes.  The  more  conservative  Tukey's 
studentized  range  test  found  the  light  rain  rate  significantly  less  than 
only  the  medium  rate. 
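The F value quoted above is the ratio of the between-class mean square to the within-class mean square. The sketch below computes that ratio for three unbalanced classes. The class values and names are hypothetical stand-ins for the log transformed errors; the follow-up Fisher and Tukey comparisons are not reproduced here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the class means about the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations about their own class mean.
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means)
    )

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical log transformed errors for the three rain rate classes.
light = [0.9, 1.1, 1.4, 1.0]
medium = [1.8, 2.2, 2.0, 2.5, 1.9, 2.3]
intense = [2.1, 2.6, 1.7]

f_value, df_b, df_w = one_way_anova_f([light, medium, intense])
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```

The F value is then compared with the F distribution on the stated degrees of freedom to obtain the level of significance.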

To further pinpoint the rain rate class differences another analysis
of variance was run on the data by sample intervals. If this test
indicated significant differences between rain rate classes for a certain
sample  interval  then  Fisher's  and  Tukey's  tests  were  also  run  on  the 
data.  The  analysis  of  variance  indicated  significant  differences  for 
only  the  5,  20,  and  60  minute  intervals. 

With a sample interval of 5 minutes the analysis calculated an F
value of 7.00 and a level of significance of p = .0014, indicating a
significant difference. Fisher's test then showed that the medium rate
errors differed from the light and intense rate categories. Tukey's
test found the medium rate errors to differ only from the light rates.
Since in this interval the mean error was less than 1.5% and the
greatest single error was less than 6%, these differences for the
5 minute interval are of minor importance.

The analysis of variance on the 20 minute sample interval
calculated an F value of 4.36 and a level of significance of p = .0152,
indicating  significant  differences.  With  Fisher's  and  Tukey's  tests  on 
this  interval  the  light  rate  errors  were  found  to  be  significantly  less 
than  the  medium  and  intense  rates. 

Finally, an analysis of variance run on the 60 minute sample interval
calculated an F value of 5.86 and a level of significance of p =
.0039, again indicating a significant difference in mean errors of the
rain rate classes. Further testing with Fisher's test showed the light
rain rate errors differ from the medium and intense rate errors.
Tukey's test showed that the light rate errors differ only from the
medium rate errors.

In  summary  the  light  rain  rate  class  had  several  significant 
differences.  In  Fig.  8  it  appears  that  the  light  rain  rate  class  had 
less  mean  error  than  the  other  classes  at  every  interval.  The  analysis 
of  variance  performed  on  the  mean  errors  of  these  subdivisions  found  the 
light  rain  rate  to  be  significantly  less  than  the  other  classes.  Yet, 
when  the  light  rain  rate  class  was  further  tested  by  sample  interval  it 
was found to be significantly less in only the 20 and 60 minute
intervals. There is an imbalance in sample sizes in the rain rate classes
by sample interval, as the light rain rate class mean was of 20
observations, the medium mean was of 70, and the intense mean was of
only 13. If this same analysis were run on a more balanced class
structure, each with at least 30 observations, the results would be more
conclusive. As it stands the tests found only two out of nine sample
intervals' mean errors in the light rain rate class and the overall
class mean error to be significantly less than the other classes. With
sample size kept in mind it can not be concluded that the difference in
the light rain rate class is of overall significance.

The  next  classifications  of  the  data  were  by  precipitation  area 
depths.  Fig.  9  is  a  plot  of  the  mean  absolute  percent  error  against  the 
sample  intervals  for  the  three  subdivisions  of  area  depth.  The  classes 
were  by  precipitation  area  depths  greater  than  80  km  (wide),  depths 
within  40  km  to  80  km  (medium),  and  depths  of  less  than  40  km  (thin). 
There  were  351  observations  in  the  wide  class,  360  in  the  medium  class, 
and  216  in  the  thin  class.  The  analysis  of  variance  performed  on  the 
mean  errors  of  these  classes  resulted  in  an  F  value  of  1.78  with  a  level 
of significance of p = .1699. This indicated no significant differences
between  the  mean  errors  of  these  precipitation  area  depth  classes.  In 
Fig.  9  the  mean  errors  show  no  large  differences  for  any  class  at  any 
specified  interval.  In  the  plot  there  is  a  mild  overall  tendency  for 
the  thin  depths  to  have  greater  errors  and  the  wide  depths  to  have  less. 
Synoptically this probably relates to the fact that more turbulent
activity may often be associated with a smaller, thinner line type of
precipitation event and more steady, stratiform type rain associated
with a wider depth storm area. Overall, the statistical testing did
not show these depth class error differences to be significant.
Evidently classifications by storm depth do not result in the greater
definition of the mean errors of total rain estimates.

The last subdivisions of the data were by the precipitation area's
speed. The divisions were for area speeds greater than 50 km/h (fast),
for speeds within 30 km/h to 50 km/h (medium), and for speeds less than
30 km/h (slow). This divided the total number of observations into
groups of 90 in the fast class, 423 in the medium class, and 414 in the
slow  class.  The  analysis  of  variance  performed  on  mean  errors  by  class 
produced  an  F  value  of  .250  with  a  level  of  significance  of  p  =  .7775. 
This  result  indicated  no  significant  differences  between  class  mean 
errors  for  the  speed  subdivisions.  The  mean  absolute  percent  errors  of 
these  classes  were  plotted  against  the  sample  intervals  in  Fig.  10.  In 
the figure the main difference in the respective curves is that the
fast class, at longer sample intervals, has less mean error than the
other classes. This may be due indirectly to the imbalance of class
sizes. This small sample of fast moving areas could be unrepresentative
of the population. The fast class had less than one-fourth the
observations of the other classes. For this reason there can be less
confidence placed in conclusions about the fast class. It appears that
subdividing rainfall data into area speeds does nothing to help explain
more of the mean errors of total rain estimates.

Regression  Analyses 

Regression analyses are tools for further examining and even
predicting the mean errors associated with different sampling measures. A
correlation analysis is a starting point for selecting possible
variables to include in a regression. Since mean rain rate was a variable
of interest the analysis was done using the ADATA data. Error and rainfall
measurements were log transformed to approach normalization. It is
obvious, from the correlations previously stated, that the sample
interval, the number of samples, the sequential variability, and the
rainfall measurements were important in explaining the errors of total rain
estimates. There was little correlation between the precipitation
area's  physical  characteristics  of  speed  or  depth,  and  the  errors.  The 
higher  correlations  of  the  sample  intervals  and  numbers  of  samples  to 
the  errors  point  to  their  greater  importance  in  a  regression  relation. 

In addition to the eight variables discussed earlier, it was felt
that several other variable products could have been important.
Variables that could be estimated with two or fewer radar scans would be
useful in a predictive regression equation. The sample interval, area
depth, area speed, and mean rain rate were candidates for the variable
products. Since these variables could be determined or estimated before
a storm was over the area of interest, these variables were of use in a
predictive sense. Included in the regressions were products of the
sample interval with the speed, rain rate, and area depth, products of
the rain rate with the speed and depth, and the product of the speed
and depth.

The next step was to do simple linear regressions with a procedure,
RSQUARE (SAS Institute Inc., 1982a), that does all possible regressions
between the different combinations of the independent variables and
the given dependent variable. The ADATA data were used because of their
valid rain rate measurements. The procedure was run first on the
variables, then on their natural log transforms, and finally on the
variable products. Table 5 summarizes the results of the variables'
association with the log of the absolute percent error.

In Table 5 the multiple correlation coefficients, R, are an
expression of the degree of association between the variables in the
model. The coefficient of determination, $R^2$, when multiplied by 100 is
the percent of variance in the dependent variable that can be explained
by the independent variable or model. The table clearly shows the
sample interval and the number of samples taken to be the most important
variables in any regression used to predict errors. The natural log
transformed values of these variables have greater $R^2$ values than do
the untransformed values; thus the transformed values should be used in
a final regression model.

Table 5. Coefficients of determination and correlation coefficients
for the variables' association with the log of the absolute percent
error as the dependent variable.

Independent variables

Area speed, SPD
Sampled depth, SAMPDEP
Area depth, STRMDEP
Total rain, Rt
Mean rain rate, Ra
Sequential variability, D
Number of samples, N
Sample interval, T

Ln (Mean rain rate), LRA
Ln (Total rain), LRT
Ln (Seq. variability), LD
Ln (Number of samples), LN
Ln (Sample interval), LT

SPD X STRMDEP
Ln (Ra X SPD)
Ln (Ra X STRMDEP)
T X SPD
T X STRMDEP
Ln (Ra X T)


The sequential variability is seen as the next important variable
for explaining the errors. Its transformed value shows an R of 0.2145.
This evidently makes it a more important value than either the rainfall
variables or their transforms. This differs from Mueller's (1957)
findings  that  related  the  standard  error  of  raingage  estimates  to  the 
same  type  variables  used  in  this  study.  He  concluded  that  the  best 
relationship  existed  between  the  standard  error  of  the  rainfall  estimate 
with  the  sample  interval  and  the  total  rainfall,  instead  of  with  the 
sample  interval  and  the  sequential  variability.  Mueller's  total  rain 
was  for  the  entire  storm  time,  which  is  significantly  different  from  the 
total  rain  in  this  study.  Here  the  total  rain  was  measured  for  only 
80 minutes, which may, or more often may not, have been the entire storm
time.  This  also  means  that  in  this  study  rain  along  the  storm's  leading 
edge  was  more  often  measured.  The  leading  edge  of  a  cold-front  may 
structurally  be  more  active  and  turbulent,  resulting  in  a  more  erratic 
temporal  rain  profile  and  thus  a  higher  sequential  variability  with 
generally  larger  sampling  errors.  This  may  account  for  the  greater 
relative  importance  of  the  sequential  variability  in  this  study. 

The mean rain rate and the total rain appear to be of relatively
less importance in Table 5, as indicated by their $R^2$ values. They
explain a maximum of only 1.5% of the error's variability even when
transformed. These variables were not considered important enough to be
included in the final regression model.

The precipitation area's physical characteristics of speed, depth
and sampled depth have very small $R^2$ values. These measurements are
then nearly useless in helping to explain the errors in total rainfall
estimates. For this reason these variables were not log transformed for
further work in obtaining the final regression equation.

Table 5 has very small $R^2$ values for all of the attempted variable
products. All products involving the sample interval have much smaller
$R^2$ values than does the sample interval by itself. Since the products
and the interval are mutually exclusive in a regression, only the sample
interval should be used. The remaining products of speed, depth and
rain rate explain less than 0.7% of the error variability.
Therefore, they are of no use in a final regression equation.

Table 6 shows the complete result of all possible regressions on
the log transformed model with the relatively important variables. The
sample interval and the number of samples are just two methods of
quantifying the radar sampling of a storm, therefore they are mutually
exclusive in a regression. This means that there is a "best" model for
each of these variables. In Table 6 the log of either sampling method
alone accounts for approximately 53% of the error variability. Adding
the sequential variability to either sampling method adds less than 5%
to the explained variability. The addition of either rainfall variable
to the model with sampling method and sequential variability increases
the $R^2$ by nearly 7 percent. Including both rainfall parameters in that
model results in approximately a 9 percent increase in explained
variance of the errors.
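The increments quoted above can be checked directly from the coefficients of determination listed in Table 6. The short sketch below does this for the models built on the sample interval (LT). The dictionary layout and function name are illustrative only, and the values are copied from Table 6.

```python
# Coefficients of determination copied from Table 6 for the LT-based models.
r_squared = {
    ("LT",): 0.529,
    ("LT", "LD"): 0.575,
    ("LT", "LD", "LRA"): 0.638,
    ("LT", "LD", "LRT"): 0.646,
    ("LT", "LD", "LRT", "LRA"): 0.659,
}


def gain_in_percent(base, extended):
    """Additional percent of error variance explained by the extended model."""
    return round(100.0 * (r_squared[extended] - r_squared[base]), 1)


steps = [
    (("LT",), ("LT", "LD")),
    (("LT", "LD"), ("LT", "LD", "LRA")),
    (("LT", "LD"), ("LT", "LD", "LRT")),
    (("LT", "LD"), ("LT", "LD", "LRT", "LRA")),
]
for base, extended in steps:
    print(" ".join(base), "->", " ".join(extended),
          gain_in_percent(base, extended))
```

With these values, adding LD contributes about 4.6 percentage points, either rainfall variable about 6 to 7 points, and both rainfall variables together about 8.4 points, in line with the rounded figures in the text.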

Table 6. All possible regressions on dependent variable LAPE,
natural log of the absolute percent error.

Number in model    $R^2$    Variables in model

       1           0.015    LRA
       1           0.015    LRT
       1           0.046    LD
       1           0.527    LN
       1           0.529    LT

       2           0.110    LD LRA
       2           0.191    LRT LRA
       2           0.244    LD LRT
       2           0.527    LN LRT
       2           0.529    LT LRT
       2           0.530    LT LN
       2           0.542    LN LRA
       2           0.544    LT LRA
       2           0.573    LN LD
       2           0.575    LT LD

       3           0.244    LD LRT LRA
       3           0.530    LT LN LRT
       3           0.545    LT LN LRA
       3           0.576    LT LN LD
       3           0.586    LT LRT LRA
       3           0.587    LN LRT LRA
       3           0.637    LN LD LRA
       3           0.638    LT LD LRA
       3           0.646    LT LD LRT
       3           0.646    LN LD LRT

       4           0.588    LT LN LRT LRA
       4           0.640    LT LN LD LRA
       4           0.647    LT LN LD LRT
       4           0.659    LT LD LRT LRA
       4           0.660    LN LD LRT LRA

       5           0.661    LT LN LD LRT LRA

These results were then used to construct the "best" regression
models. It was immediately obvious that the sample interval and the
number of samples were the variables of most importance, and a separate
model would have to be made for each. The two resulting equations would
allow prediction of the sampling errors by determining a priori either
the sampling interval desired or the number of samples possible in a
specified storm. The sequential variability and the total rainfall were
seen to be somewhat important but are only known after the fact. They
should be included in a type of equation for post event analyses but
they are not useful in the predictive sense. Since a main objective of
this study was to develop predictive regression equations it was decided
to not include the sequential variability or the total rain in further
modeling. The mean rain rate can be roughly estimated with one or two
radar scans, and therefore would be useful in a predictive equation. In
Table 6 the addition of the mean rain rate to either sampling model only
resulted in an increase of approximately 0.015 in $R^2$. This slight
amount of improvement in the model was by far outweighed by the fact
that radar scans had to be made for even a rough estimate of the rain
rate. For this reason the rain rate was not included in the final
model, which then allowed estimation of errors without turning on the
radar.

The two "best" regression models for the stated objectives were
then of the simple single variable type. One model,

$\mathrm{LAPE} = a + b\,(\mathrm{LT})$ ,    (10)

relates the natural log of the absolute percent error (LAPE) and the
natural log of the sample interval (LT) in minutes, with a and b
regression constants. The other model's equation is

$\mathrm{LAPE} = a + b\,(\mathrm{LN})$ ,    (11)

which relates the log of the absolute percent error (LAPE) to the log of
the number of samples taken (LN), again with a and b regression
constants. These single variable regression models have an $R^2$ value of
approximately 0.53, as seen in Table 6. The advantage to such simple
models is that they require no measurement of storm characteristics from
preliminary radar scans prior to the start of rainfall measurements.


Final regressions on the models were done with the combined data
sets. In Figs. 6 and 7 the data indicated that a better fit might be
found using a second or third order polynomial regression. The General
Linear Models (GLM) regression procedure (SAS Institute Inc., 1982b) was
executed with three models for each of the two independent variables.
The models were

Linear       $\mathrm{Ln}\,Y = a + b\,(\mathrm{Ln}\,X)$    (12)

Quadratic    $\mathrm{Ln}\,Y = a + b\,(\mathrm{Ln}\,X) + c\,(\mathrm{Ln}\,X)^2$    (13)

Cubic        $\mathrm{Ln}\,Y = a + b\,(\mathrm{Ln}\,X) + c\,(\mathrm{Ln}\,X)^2 + d\,(\mathrm{Ln}\,X)^3$    (14)

where Y is the dependent variable, absolute percent error plus 2 (to
avoid Ln(0) computations), a, b, c, and d are regression constants, and
X is the independent variable. The variable, X, is the sample interval
in one model, then the number of samples in the other.
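As an illustration of the linear model, Eq. (12), the sketch below fits Ln(Y) = a + b Ln(X) by ordinary least squares, with Y taken as the absolute percent error plus 2 as defined above. It also returns the mean square error of the fit. The function name and the data are hypothetical; the study's regressions were run with the SAS GLM procedure, which this sketch does not reproduce.

```python
import math


def fit_log_linear(x_values, ape_values, offset=2.0):
    """Least-squares fit of Ln(Y) = a + b*Ln(X), where Y = APE + offset.

    Returns (a, b, mse); mse is the residual mean square on n - 2 degrees
    of freedom, in log units.
    """
    log_x = [math.log(x) for x in x_values]
    log_y = [math.log(e + offset) for e in ape_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [v - (a + b * u) for u, v in zip(log_x, log_y)]
    mse = sum(r * r for r in residuals) / (n - 2)
    return a, b, mse


# Hypothetical sample intervals (min) and absolute percent errors.
intervals = [5, 10, 15, 20, 30, 40, 60]
errors = [1.0, 5.0, 6.0, 15.0, 20.0, 45.0, 40.0]

a, b, mse = fit_log_linear(intervals, errors)
print(f"a = {a:.3f}, b = {b:.3f}, MSE = {mse:.3f}")
```

The quadratic and cubic models, Eqs. (13) and (14), add powers of Ln(X) as further regressors and need a general multiple regression solver rather than these closed-form expressions.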

Regression analysis was done first using the log of the sample
interval (LT) in minutes as the independent variable in Eqs. (12), (13),
and (14). The results are in the Appendix in Tables 9, 10, and 11
respectively. There was no significant improvement in $R^2$ for the
polynomial models. The quadratic parameter had a p-value of 0.2131 which
makes it not significant to the regression. In the cubic regression the
intercept and linear parameters were insignificant to the regression and
the higher order terms, $\mathrm{LT}^2$ and $\mathrm{LT}^3$, were
significant with p-values of 0.0319 and 0.0392 respectively. The
polynomials showed increased standard errors of estimates over the
linear model. The overall results
indicate the best regression is the linear model

$\mathrm{APE} = e^{-1.212}\,T^{1.161} - 2$    (15)

where APE is the absolute percent error and T is the sample interval in
minutes. Here, 2 is subtracted because it was added previously to the
regression model to avoid Ln(0) computations. This model's predicted
error values are plotted with the actual observed mean errors in Fig.
11. As the sample interval increases, the model's underestimate of the
mean error increases significantly. The model underestimates the
observed mean errors by over 20% beyond sample intervals of 25 minutes.

The underestimation is a bias that is a consequence of using the log
transforms for regressions. The bias makes Eq. (15) useful for
prediction of errors only at sample intervals below approximately 15 minutes.

Freund (1977) states a method which is used to reduce the bias when
working with log transformed data. A bias correction factor can be
added before the antilogarithms of the estimated errors are taken. The
equation for this correction in this case becomes

$\mathrm{APE}^* = \mathrm{Ln}^{-1}\left[\mathrm{LAPE} + k\,(\mathrm{MSE}_{\mathrm{model}})\right] - 2$    (16)

where APE* is the bias corrected absolute percent error, LAPE is the log
of the absolute percent error calculated by the regression equation, k =
0.5, and MSE is the mean square error of the regression model. With the
appropriate numbers substituted this equation becomes

$\mathrm{APE}^* = \mathrm{Ln}^{-1}\left[(-1.212 + 1.161\,\mathrm{LT}) + 0.5\,(0.761)\right] - 2$    (17)

where LT is the log of the sample interval in minutes. This equation
can be written in a multiplicative form as

$\mathrm{APE}^* = e^{-0.832}\,T^{1.161} - 2$    (18)

where T is the sample interval in minutes.
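The effect of the bias correction can be worked through numerically. The sketch below evaluates the regression with and without the correction term, using the constants that appear in Eq. (17): intercept -1.212, slope 1.161, k = 0.5, and MSE = 0.761. It assumes that the inverse log in Eqs. (16) and (17) is the natural exponential. The function and constant names are illustrative only.

```python
import math

INTERCEPT = -1.212   # regression intercept from Eq. (17)
SLOPE = 1.161        # regression slope from Eq. (17)
K = 0.5              # bias correction constant k
MSE = 0.761          # mean square error of the regression model
OFFSET = 2.0         # constant added to the error before taking logs


def ape_uncorrected(t_min):
    """Predicted absolute percent error without the bias correction."""
    return math.exp(INTERCEPT + SLOPE * math.log(t_min)) - OFFSET


def ape_corrected(t_min):
    """Bias corrected prediction, Eq. (17): K * MSE is added before the antilog."""
    return math.exp(INTERCEPT + SLOPE * math.log(t_min) + K * MSE) - OFFSET


# The corrected intercept is the exponent of e in the multiplicative form.
corrected_intercept = INTERCEPT + K * MSE

for t in (15, 30, 60):
    print(t, round(ape_uncorrected(t), 1), round(ape_corrected(t), 1))
```

Adding 0.5 (0.761) = 0.3805 to the intercept gives about -0.832, the exponent in Eq. (18). In multiplicative terms the correction scales the quantity APE + 2 by e^0.3805, or about 1.46, at every sample interval.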




Fig.  11.  Plot  of  regression  predicted  absolute  percent  error  by 
sample  interval.  Observed  mean  errors,  a  linear  regression  line-of- 
best-fit,  and  the  linear  regression  line  when  corrected  for  log  bias  are 
shown. 


The  bias  corrected  absolute  percent  errors  were  calculated  using 
Eq.  (18).  The  results  are  listed  in  Table  7  and  plotted  in  Fig.  11.  In 
Fig. 11 the corrected errors are seen to closely approximate the
observed errors. Although the regression equation estimates errors that are
more  in  agreement  with  the  actual  observed  errors,  it  must  be  stated 
that  the  bias  correction  technique  is  not  mathematically  exact  enough  to 
eliminate  totally  all  bias  errors.  With  that  caution  in  mind  useful 
prediction  limits  can  be  drawn  about  the  regression  line. 


Table 7. Summary, by sample interval, of regression errors, bias
corrected regression errors, and upper 95% prediction limits.

Sample      Observed    Regression    Corrected     95%
interval    errors      predicted     predicted     prediction
(min)       (%)         errors (%)    errors (%)    limits (%)

   5          1.4         -0.1           0.8            3
  10          4.4          2.3           4.3           18
  15          8.0          4.9           8.1           34
  20          ...          ...          12.1           50
  25          ...          ...          16.3          ...
  30         24.9          ...          20.6           86
  40          ...         19.6          29.5          123
  50          ...         25.9          38.8          163
  60         45.4         32.5          48.5          ...


Fig. 12 is a plot of the bias corrected regression estimated errors
by sample interval, with the observed errors as points. The upper 95%
prediction limit was calculated at each sample interval point using a
standard equation from Koopmans (1981). The prediction limit values for
each sample interval are listed in Table 7. The width of the prediction
limit reflects the great variability in the sample measurements,
particularly at the larger sample intervals. This is the limit that the
errors would be expected to be within 95% of the time for any single
grid area sample. This prediction limit then clearly illustrates the
wide error range to be expected when making total rainfall estimations
for small numbers of samples or small areas. For example, to be assured
of less than 100% error (95% of the time) the sample interval must be
less than approximately 34 minutes. Conversely, a sample interval of 20
minutes would result in total rain estimation errors of less than 50%,
in 95 out of 100 cases.

Fig. 12. Plot of the bias corrected regression predicted error
with the upper 95% prediction limits by sample intervals. Mean observed
errors for each sample interval are plotted as points.


The  log  of  the  number  of  samples  (LN)  was  next  used  as  the  indepen¬ 
dent  variable  in  Eqs.  (12),  (13),  and  (14).  The  results  of  the  three 
regressions  are  in  the  Appendix  in  Tables  12,  13,  and  14  respectively. 
The  linear  model  gives  a  fairly  good  fit  to  the  data.  The  highly  signi¬ 
ficant  p-values,  less  than  0.0001,  and  the  small  standard  errors  are 
evidence  of  a  good  regression  fit. 

There was insignificant improvement in $R^2$ for the higher order polynomials. The linear model's $R^2 = 0.498$, the quadratic $R^2 = 0.499$, and the cubic $R^2 = 0.500$. The quadratic parameter $LN^2$ was not significant (p = .0508), and the standard errors increased. Thus the quadratic model's fit was not as good overall as the linear's. The cubic model results indicated an even worse fit. The linear parameter LN was insignificant (p = .9062), and again the higher order terms were significant. Their p-values (.0470 and .0282) indicate that a t-statistic this large or larger could have been found by chance alone in nearly 5 and 3 cases out of 100, for the $LN^2$ and $LN^3$ parameters respectively.
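The comparison of the three LN models can be checked directly from the sums of squares reported in Tables 12, 13, and 14. The following is a minimal sketch, not the procedure used in the study: the names `MODELS`, `r_squared`, and `partial_f` are hypothetical, and the partial F statistic is the standard extra-sum-of-squares test for adding one term to a nested model.

```python
# Sums of squares for the LN models, copied from Tables 12, 13, and 14.
SS_TOTAL = 2638.84569452  # corrected total sum of squares (1727 df)

# model name -> (error sum of squares, error degrees of freedom)
MODELS = {
    "linear": (1324.82201536, 1726),
    "quadratic": (1321.89579261, 1725),
    "cubic": (1318.20838327, 1724),
}


def r_squared(sse, ss_total=SS_TOTAL):
    """Coefficient of determination from the error sum of squares."""
    return 1.0 - sse / ss_total


def partial_f(sse_reduced, sse_full, df_full):
    """F statistic (1 numerator df) for the one term added to the reduced model."""
    return (sse_reduced - sse_full) / (sse_full / df_full)


for name, (sse, df_error) in MODELS.items():
    print(f"{name:9s} R^2 = {r_squared(sse):.6f}")

# F for adding the squared term, then the cubed term.
f_quad = partial_f(MODELS["linear"][0], *MODELS["quadratic"])
f_cubic = partial_f(MODELS["quadratic"][0], *MODELS["cubic"])
print(f"F for adding LN^2: {f_quad:.2f}")
print(f"F for adding LN^3: {f_cubic:.2f}")
```

The F values agree with the Type I entries for the highest order term in Tables 13 and 14, while $R^2$ changes only in the third decimal place, which is the basis for preferring the linear model.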

The linear regression model is the best in this set. It can be written with regression constants as

$$\mathrm{APE} = e^{4.339}\, N^{-1.309} - 2 \qquad (19)$$

where N is the number of samples upon which the total rain estimate was based, and APE is the absolute percent error.
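Eq. (19) is simple to evaluate for the sample counts used in the study. The sketch below is illustrative only; the function name `ape_uncorrected` is hypothetical, and the constants are the rounded values printed in Eq. (19).

```python
import math

A = 4.339   # intercept of the linear LN model, Eq. (19)
B = -1.309  # coefficient on ln(N), Eq. (19)


def ape_uncorrected(n_samples):
    """Absolute percent error predicted by Eq. (19), before bias correction."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return math.exp(A) * n_samples ** B - 2.0


# Sample counts listed in Table 8.
for n in (2, 3, 4, 5, 6, 9, 17):
    print(f"N = {n:2d}: predicted APE = {ape_uncorrected(n):5.1f}%")
```

The results reproduce the "Regression predicted errors" column of Table 8 to one decimal place. The slightly negative value at N = 17 comes from the constant 2 that was added to the errors before taking logs.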

A plot of the predicted values from this model with the actual observed mean errors is seen in Fig. 13. The same bias effects are seen here, again due to the log transformations used in the regressions. The predicted errors range from 20% less than the actual sample errors, when taking more than approximately nine samples, to over 60% less with two or three samples taken. This leaves the errors for small numbers of samples, which are the most important, relatively unpredictable when using this regression equation.

The bias correction, Eq. (16), was used with Eq. (19), resulting in a corrected form

$$\mathrm{APE}^{*} = \ln^{-1}\left[(4.339 - 1.309\,\mathrm{LN}) + 0.5\,(0.7676)\right] - 2 \qquad (20)$$

where LN is the log of the number of samples. This equation can also be written in a multiplicative form as

$$\mathrm{APE}^{*} = e^{4.723}\, N^{-1.309} - 2 \qquad (21)$$

where N is the number of samples taken in 80 minutes.
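The step from Eq. (20) to Eq. (21) can be followed by working through the arithmetic. The derivation below assumes that $\ln^{-1}$ denotes the exponential function and that 0.7676 is the mean square error of the linear LN regression (Table 12 lists 0.76756780); both readings are inferred from the surrounding text rather than stated with the equations.

```latex
\begin{align*}
\mathrm{APE}^{*}
  &= \exp\left[(4.339 - 1.309\,\mathrm{LN}) + 0.5\,(0.7676)\right] - 2 \\
  &= \exp\left[4.339 + 0.3838 - 1.309\,\ln N\right] - 2 \\
  &= e^{4.7228}\; e^{-1.309\,\ln N} - 2 \\
  &\approx e^{4.723}\, N^{-1.309} - 2
\end{align*}
```

Relative to Eq. (19), the correction multiplies the power-law term by $e^{0.3838} \approx 1.47$ and leaves the exponent and the offset of 2 unchanged.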

Using  Eq.  (21),  values  of  the  bias  corrected  regression  estimated 
errors  were  calculated,  then  listed  in  Table  8  and  plotted  in  Fig.  13. 
The  bias  corrected  regression  estimates  are  a  very  near  approximation  to 
the observed errors, as seen in Fig. 13. This regression relation can then be used to plot useful prediction limits.

Table 8. Summary, by numbers of samples, of regression errors, bias corrected regression errors, and upper 95% prediction limits.

Number of    Observed      Regression    Corrected     Upper 95%
samples      errors (%)    predicted     predicted     prediction
                           errors (%)    errors (%)    limits (%)

    2           43.7          28.9          43.4         183.8
    3           28.3          16.2          24.7         104.4
    4           16.2          10.5          16.3          68.9
    5           12.3           7.3          11.7          49.5
    6            8.0           5.3           8.8          37.2
    9            4.4           2.3           4.3          18.2
   17            1.4          -0.1           0.8           3.2


Fig. 14 is a plot of the bias corrected estimated errors with their associated upper 95% prediction limit plotted by the number of samples taken. The calculated values are listed in Table 8. The plot is of the same general form as the empirically derived Fig. 8. The prediction limit allows estimation of the largest expected error of any single total rain sample. For example, the error would be expected to be less than 50% (95% of the time) if at least five samples were taken in an 80 minute time span. The figure also illustrates the problem of oversampling, which is actually the lack of significant improvement of the accuracy of estimates made with more than nine samples during the 80 minutes.
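The diminishing return from additional samples can be illustrated with Eq. (21). The following is a rough sketch under stated assumptions: it uses only the bias corrected mean prediction, not the prediction limits, and the function names `ape_corrected` and `gain_per_added_sample` are hypothetical.

```python
import math


def ape_corrected(n_samples):
    """Bias corrected absolute percent error from Eq. (21)."""
    return math.exp(4.723) * n_samples ** -1.309 - 2.0


def gain_per_added_sample(n_samples):
    """Drop in predicted error (percentage points) from taking one more sample."""
    return ape_corrected(n_samples) - ape_corrected(n_samples + 1)


# Predicted error and marginal gain for 2 to 16 samples per 80 minute period.
for n in range(2, 17):
    print(
        f"N = {n:2d}: error = {ape_corrected(n):5.1f}%, "
        f"gain from one more sample = {gain_per_added_sample(n):4.1f} points"
    )
```

Under this model, moving from two to three samples lowers the predicted error by almost 19 percentage points, whereas moving from nine to ten lowers it by less than one point.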
CHAPTER IV


SUMMARY  AND  CONCLUSIONS 

This  study  investigated  the  effects  of  various  sampling-rates  on 
radar-derived,  total  rainfall  estimates.  Radar  observations,  taken  at 
one  minute  intervals,  were  recorded  for  nine  storms  in  1984.  Total 
rainfall  estimates,  for  10  km  by  10  km  areas,  based  on  these  data  were 
considered  "ground  truth"  totals.  Sample-rates,  ranging  from  5  to  60 
min,  were  applied  to  the  recorded  data  to  calculate  total  rain  estimates 
for  each  sample  rate.  These  derived  rain  totals  were  compared  with  the 
"ground  truth"  totals,  with  the  differences  referred  to  as  "errors." 
These  errors  were  plotted  against  the  sampling-rate  and  the  number  of 
samples  taken.  Other  variables,  investigated  for  high  correlations  with 
the  errors,  were  the  mean  rain  rate,  total  rain,  sequential  variability, 
storm  width,  and  storm  speed  of  movement.  Analyses  of  variance  were 
done  on  subdivisions  of  the  storm  width,  storm  speed,  and  mean  rain  rate 
variables.  Regression  analyses  determined  the  "best"  models,  using 
error  as  the  dependent  variable,  to  allow  prediction  of  the  expected 
errors in total rain estimates.

With this study's sample size in mind, several conclusions can be drawn:

(1) Large-sample mean absolute errors of total rain estimates were nearly 8%, 25%, and 45% with 15, 30, and 60 min sample-rates, respectively.

(2) 95% of the individual estimate errors were found to be less than approximately 25%, 75%, and 112% with 15, 30, and 60 min sample-rates, respectively.

(3)  Taking  more  than  8  samples  per  80  min  period  does  not  increase 
significantly  the  accuracy  of  the  measurement.  The  mean  error  for 
greater  than  8  samples  taken  was  less  than  approximately  5%. 

(4) If only 2, 4, or 6 samples are taken in 80 min, the error is expected to be less than 110%, 50%, or 25%, respectively, 95% of the time.

(5)  The  variables  of  highest  correlation  with  the  errors  were  the 
sample-rate  and  the  number  of  samples  taken,  with  coefficients  of  0.73 
and  -0.73  respectively. 

(6) The variables of sequential variability, mean rain rate, total rain, storm width, storm speed, and sampled storm depth had low correlations with the errors.

(7) Analyses of variance on subdivisions of the variables were inconclusive because of small, unbalanced sample sizes.

(8)  Regression  equations  were  derived  to  relate  the  errors  to  the 
sample-rate  and  the  number  of  samples  taken.  This  allows  prediction  of 
errors  for  individual  total  rain  estimates. 
Table 9. Regression results on the linear model LAPE = a + b(LT). LAPE is the natural log of (absolute percent error + 2), a and b are regression coefficients, and LT is the natural log of the sample interval in minutes.

GENERAL LINEAR MODELS PROCEDURE

DEPENDENT VARIABLE: LAPE   LOG (ABSOLUTE PERCENT ERROR+1)

SOURCE             DF    SUM OF SQUARES      MEAN SQUARE    F VALUE    PR > F
MODEL               1     1325.46786969    1325.46786969    1741.89    0.0001
ERROR            1726     1313.37782483       0.76093733
CORRECTED TOTAL  1727     2638.84569452

R-SQUARE       C.V.      ROOT MSE      LAPE MEAN
0.502291    36.4886    0.87231722     2.39065575

SOURCE    DF         TYPE I SS    F VALUE    PR > F
LT         1     1325.46786969    1741.89    0.0001

SOURCE    DF       TYPE III SS    F VALUE    PR > F
LT         1     1325.46786969    1741.89    0.0001

PARAMETER        ESTIMATE    T FOR H0: PARAMETER=0    PR > |T|    STD ERROR OF ESTIMATE
INTERCEPT     -1.21239178                   -13.65      0.0001               0.08884346
LT             1.16097819                    41.74      0.0001               0.02781723
Table 10. Regression results on the quadratic model LAPE = a + b(LT) + c(LT)^2.

GENERAL LINEAR MODELS PROCEDURE

DEPENDENT VARIABLE: LAPE   LOG (ABSOLUTE PERCENT ERROR+1)

SOURCE             DF    SUM OF SQUARES      MEAN SQUARE    F VALUE    PR > F
MODEL               2     1326.64815755     663.32407878     872.00    0.0001
ERROR            1725     1312.19753697       0.76069422
CORRECTED TOTAL  1727     2638.84569452

R-SQUARE       C.V.      ROOT MSE      LAPE MEAN
0.502738    36.4828    0.87217786     2.39065575

SOURCE    DF         TYPE I SS    F VALUE    PR > F
LT         1     1325.46786969    1742.45    0.0001
LT*LT      1        1.18028786       1.55    0.2131

SOURCE    DF       TYPE III SS    F VALUE    PR > F
LT         1       14.23054260      18.71    0.0001
LT*LT      1        1.18028786       1.55    0.2131

PARAMETER        ESTIMATE    T FOR H0: PARAMETER=0    PR > |T|    STD ERROR OF ESTIMATE
INTERCEPT     -0.86721488                    -2.98      0.0029               0.29099967
LT             0.90318417                     4.33      0.0001               0.20881930
LT*LT          0.04459334                     1.25      0.2131               0.03579984
Table 11. Regression results on the cubic model LAPE = a + b(LT) + c(LT)^2 + d(LT)^3.

GENERAL LINEAR MODELS PROCEDURE

DEPENDENT VARIABLE: LAPE   LOG (ABSOLUTE PERCENT ERROR+1)

SOURCE             DF    SUM OF SQUARES      MEAN SQUARE    F VALUE    PR > F
MODEL               3     1329.88234677     443.29411559     583.85    0.0001
ERROR            1724     1308.96334775       0.75925948
CORRECTED TOTAL  1727     2638.84569452

R-SQUARE       C.V.      ROOT MSE      LAPE MEAN
0.503964    36.4484    0.87135497     2.39065575

SOURCE      DF         TYPE I SS    F VALUE    PR > F
LT           1     1325.46786969    1745.74    0.0001
LT*LT        1        1.18028786       1.55    0.2126
LT*LT*LT     1        3.23418922       4.26    0.0392

SOURCE      DF       TYPE III SS    F VALUE    PR > F
LT           1        1.46726572       1.93    0.1647
LT*LT        1        3.49964941       4.61    0.0319
LT*LT*LT     1        3.23418922       4.26    0.0392

PARAMETER        ESTIMATE    T FOR H0: PARAMETER=0    PR > |T|    STD ERROR OF ESTIMATE
INTERCEPT      1.54286725                     1.28      0.2000               1.20338098
LT            -1.93063158                    -1.39      0.1647               1.38880136
LT*LT          1.08544392                     2.15      0.0319               0.50558040
LT*LT*LT      -0.12111800                    -2.06      0.0392               0.05868419

Table 12. Regression results on the linear model LAPE = a + b(LN). LAPE is the natural log of (absolute percent error + 2), a and b are regression coefficients, and LN is the natural log of the number of samples taken.

GENERAL LINEAR MODELS PROCEDURE

DEPENDENT VARIABLE: LAPE   LOG (ABSOLUTE PERCENT ERROR+1)

SOURCE             DF    SUM OF SQUARES      MEAN SQUARE    F VALUE    PR > F
MODEL               1     1314.02367916    1314.02367916    1711.93    0.0001
ERROR            1726     1324.82201536       0.76756780
CORRECTED TOTAL  1727     2638.84569452

R-SQUARE       C.V.      ROOT MSE      LAPE MEAN
0.497954    36.6472    0.87610947     2.39065575

SOURCE    DF         TYPE I SS    F VALUE    PR > F
LN         1     1314.02367916    1711.93    0.0001

SOURCE    DF       TYPE III SS    F VALUE    PR > F
LN         1     1314.02367916    1711.93    0.0001

PARAMETER        ESTIMATE    T FOR H0: PARAMETER=0    PR > |T|    STD ERROR OF ESTIMATE
INTERCEPT      4.33917747                    84.10      0.0001               0.05159460
LN            -1.30856716                   -41.38      0.0001               0.03163662


Table 13. Regression results on the quadratic model LAPE = a + b(LN) + c(LN)^2.

GENERAL LINEAR MODELS PROCEDURE

DEPENDENT VARIABLE: LAPE   LOG (ABSOLUTE PERCENT ERROR+1)

SOURCE             DF    SUM OF SQUARES      MEAN SQUARE    F VALUE    PR > F
MODEL               2     1316.94990191     658.47495095     859.27    0.0001
ERROR            1725     1321.89579261       0.76631640
CORRECTED TOTAL  1727     2638.84569452

R-SQUARE       C.V.      ROOT MSE      LAPE MEAN
0.499063    36.6174    0.87539500     2.39065575

SOURCE    DF         TYPE I SS    F VALUE    PR > F
LN         1     1314.02367916    1714.73    0.0001
LN*LN      1        2.92622275       3.82    0.0508

SOURCE    DF       TYPE III SS    F VALUE    PR > F
LN         1       78.51645282     102.46    0.0001
LN*LN      1        2.92622275       3.82    0.0508

PARAMETER        ESTIMATE    T FOR H0: PARAMETER=0    PR > |T|    STD ERROR OF ESTIMATE
INTERCEPT      4.55485364                    37.39      0.0001               0.12181660
LN            -1.61396511                   -10.12      0.0001               0.15944756
LN*LN          0.08983264                     1.95      0.0508               0.04597106

Table 14. Regression results on the cubic model LAPE = a + b(LN) + c(LN)^2 + d(LN)^3.

GENERAL LINEAR MODELS PROCEDURE

DEPENDENT VARIABLE: LAPE   LOG (ABSOLUTE PERCENT ERROR+1)

SOURCE             DF    SUM OF SQUARES      MEAN SQUARE    F VALUE    PR > F
MODEL               3     1320.63731126     440.21243709     575.73    0.0001
ERROR            1724     1318.20838327       0.76462203
CORRECTED TOTAL  1727     2638.84569452

R-SQUARE       C.V.      ROOT MSE      LAPE MEAN
0.500460    36.5769    0.87442669     2.39065575

SOURCE      DF         TYPE I SS    F VALUE    PR > F
LN           1     1314.02367916    1718.53    0.0001
LN*LN        1        2.92622275       3.83    0.0506
LN*LN*LN     1        3.68740935       4.82    0.0282

SOURCE      DF       TYPE III SS    F VALUE    PR > F
LN           1        0.01061884       0.01    0.9062
LN*LN        1        3.02136495       3.95    0.0470
LN*LN*LN     1        3.68740935       4.82    0.0282

PARAMETER        ESTIMATE    T FOR H0: PARAMETER=0    PR > |T|    STD ERROR OF ESTIMATE
INTERCEPT      3.87278729                    11.61      0.0001               0.33357677
LN            -0.08421019                    -0.12      0.9062               0.71457775
LN*LN         -0.90630316                    -1.99      0.0470               0.45592699
LN*LN*LN       0.19282472                     2.20      0.0282               0.08780625